TECHNICAL REPORT


The RICIS Concept


The University of Houston-Clear Lake established the Research Institute for 
Computing and Information Systems in 1986 to encourage NASA Johnson Space
Center and local industry to actively support research in the computing and 
information sciences. As part of this endeavor, UH-Clear Lake proposed a 
partnership with JSC to jointly define and manage an integrated program of research 
in advanced data processing technology needed for JSC’s main missions, including 
administrative, engineering and science responsibilities. JSC agreed and entered into
a three-year cooperative agreement with UH-Clear Lake beginning in May 1986, to
jointly plan and execute such research through RICIS. Additionally, under
Cooperative Agreement NCC 9-16, computing and educational facilities are shared
by the two institutions to conduct the research. 

The mission of RICIS is to conduct, coordinate and disseminate research on 
computing and information systems among researchers, sponsors and users from 
UH-Clear Lake, NASA/JSC, and other research organizations. Within UH-Clear 
Lake, the mission is being implemented through interdisciplinary involvement of 
faculty and students from each of the four schools: Business, Education, Human 
Sciences and Humanities, and Natural and Applied Sciences. 

Other research organizations are involved via the “gateway” concept. UH-Clear 
Lake establishes relationships with other universities and research organizations, 
having common research interests, to provide additional sources of expertise to 
conduct needed research. 

A major role of RICIS is to find the best match of sponsors, researchers and 
research objectives to advance knowledge in the computing and information 
sciences. Working jointly with NASA/JSC, RICIS advises on research needs, 
recommends principals for conducting the research, provides technical and 
administrative support to coordinate the research, and integrates technical results 
into the cooperative goals of UH-Clear Lake and NASA/JSC.
Preface


This document constitutes the first delivery, “Updated Survey Report,” of the four
deliveries scheduled for the second phase of RICIS contract 069, “Verification and 
Validation of Expert Systems Study.” This deliverable is an update to the 
“Revised Final Report,” delivered on October 31, 1990, which was the final delivery 
of the first phase of this contract. 
Background


The purpose of this deliverable is to report the state-of-the-practice in Verification 
and Validation (V&V) of Expert Systems (ESs) on current NASA and Industry 
applications. This is the first task of a series which has the ultimate purpose of 
ensuring that adequate ES V&V tools and techniques are available for Space Station 
Knowledge Based Systems development. 

The strategy for determining the state-of-the-practice is to check how well each of
the known ES V&V issues is being addressed and to what extent these issues have
affected the development of Expert Systems.

Note: This task does not attempt to prove or disprove whether Verification and
Validation can or should be performed on Expert Systems. It is accepted that
Verification and Validation should be applied to all software systems, including Expert
Systems.


Executive Summary

Data from over sixty Expert System (ES) projects was collected through a written 
survey and/or interviews. Forty basic questions were asked, ranging over a variety of
general topics such as the size of the ES and the difficulty in specifying requirements.
However, all the questions were designed to gather information about different
aspects of V&V. Significant results include the following points (see
“Summary of Results” on page 8 for the actual percentages):

1. In most cases, the ES was expected to be at least as accurate as the expert but 
often the ES was less accurate. 

2. All users estimated the ES to be less accurate than expected while half the devel- 
opers estimated the ES to be less accurate than expected. 

3. Less than half the systems had a requirements document. 

4. On average, a quarter of the developers' time was spent on V&V.

5. While developers thought evaluating an expert system was of average difficulty, 
users unanimously thought it was hard. 

6. All V&V techniques were used, with each technique being relied upon, by at 
least one project, as the sole V&V technique used. 

7. The most often cited V&V problems were test coverage determination, know- 
ledge validation, and problem complexity. 

Based on an analysis of the survey results, several recommendations were formu- 
lated. These recommendations are: 

1. Develop suggested V&V requirements for ESs, that is, standards and guidelines
for V&V of ESs at each stage of development.

2. Address the test coverage determination, knowledge validation, and problem 
complexity issues. 

3. Develop ways to make knowledge bases more easily modularized and easier to 
understand. 

4. Address the configuration management of expert systems. 

5. Develop criteria to classify an ES by intended use so that V&V requirements 
can be tailored to different types of ESs. 

6. Investigate ways to assist an expert in analyzing a knowledge base, possibly 
either through the use of analysis tools or higher level representations. 
Survey Rationale

It is widely claimed that Expert Systems have not been subject to the same
level of Verification and Validation as traditionally developed software. Some
people feel that this lack of V&V continues because of a “vicious circle,” where
nobody requires expert system V&V, so nobody does it. Consequently, since
nobody knows how to do it, nobody requires it. There are two major reasons why
the V&V process has not been documented: lack of a single life-cycle model, and
technical differences between traditional software and expert systems.

Most expert system development life-cycles rely on iterative prototypes to develop 
the system behavior. This approach does not lead to methodical capture and doc- 
umentation of the expected system behavior. Documented expectations, tradi- 
tionally captured in a requirements document, are essential in the V&V process: 
you can't do testing if you don't know what to test for! One goal of this survey is 
to understand how the expected behavior of current expert systems is communicated 
and evaluated, even if a formal requirements document was not developed. 

Expert Systems are typically composed of three parts: the knowledge base (KB), the
inference engine, and the interface code between the inference engine and the peripheral
devices (terminals, sensors, effectors, users, etc.). The inference engine and
interface code are simply traditional software and should currently be V&Ved by 
accepted practices. This survey will help determine if these parts are V&Ved or 
whether, since they are part of an expert system, V&V is overlooked. 

The knowledge base is the only part of the Expert System that raises new and
unique issues. The possible issues include:

Issues primarily due to use of nonprocedural languages 

• Understandability and readability to support inspections 

• Testing coverage 

• Standard validation tests for inference engines 

• Real-time performance analysis 

Issues due to heuristic knowledge (difficulty in organizing) 

• Knowledge validation 

• Modularity Design 

Issues primarily due to solving new complex problems 

• Requirements 

• Certification 

Other issues 

• Uncertainty Analysis

• Inheritance Process Test and Analysis 

• Configuration Management 

One of the purposes of this survey is to find out if these identified possible issues 
actually cause problems in practice, and if so, how the issues are being handled. 





Updated Survey Report 


Purpose of the Questionnaires 

Some of the information for this survey can be captured fairly easily and is accom- 
plished through use of a questionnaire. The information captured this way includes: 

• Application information - What kind of problem does the system address?
What are the performance goals?

• Expertise information - What was the relationship between the developers and
expert(s)? What is the performance level of the expert?

• Development information - How was the system developed? How big is the
system?

• Evaluation information - How was the system evaluated?

• Performance information - How important is good performance? How well is
the ES performing?


Purpose of the Interviews 

The questionnaire answers lead to an additional set of questions involving the V&V
issues described earlier. The additional questions are greatly affected by the answers
provided in the questionnaire, so it is more efficient to derive the information
through direct interviews than to generate a large number of secondary
questionnaires. The interviews attempt to uncover:

• the real issues involved in ES V&V (in comparison with the known possible 
issues outlined above), 

• what is being done currently to address V&V (inspections, path testing, testing
by the expert),

• what makes users trust the ESs, if the ESs are indeed trusted,

• what problems, unique to ESs, were encountered and possibly addressed during 
development and test. 

The interviews are also required because we expect that some people will not fill out 
the questionnaires. 
Survey Administration

This survey was designed so that the majority of the information would be gained 
from direct interviews with people involved in ES projects. Several people from
each project, including developers, users, and managers, were interviewed to get a 
realistic view of the projects. 

Several other activities were undertaken, both before and after the interview activity, 
to ensure that the results of the survey reflected the actual "state-of-the-practice". 
These activities included: 

Identifying candidate ES projects 

A list of projects to be contacted was created. The list included projects 
at NASA and IBM as well as projects from fields outside of the space 
industry. 

Developing survey questionnaire(s)

To improve the chances of getting meaningful data from the question- 
naire activity, separate questionnaires were developed for developers and 
users. Each questionnaire includes a question to indicate if the answers 
are from a manager or non-manager. Questionnaires are listed in 
Appendix B, “Expert Systems Evaluation Questionnaire (Developer)”
on page 38 and Appendix C, “Expert Systems Evaluation Questionnaire 
(User)” on page 46. 

Evaluating returned questionnaires 

Each questionnaire was evaluated to determine if project interviews 
would uncover more information. If a project was to be interviewed, 
the questionnaire results provided guidance on which topics would be 
the most useful to explore.

Summarizing interview/questionnaire results

The summarized results of the questionnaire/interview activities are pre- 
sented in section “Summary of Results” on page 8. 

Recommendations 

Recommendations for further action, based on the information in 
“Summary of Results” on page 8 are provided in section 
“Recommendations” on page 23. 


Survey Questionnaires

Different versions of the questionnaire were developed for developers and users of 
the expert system. In addition, responses were expected to be different between 
managers and non-managers, so an indication is included on each questionnaire. 


Information Gathered 

Several types of information are captured by the questionnaire. Each question in 
the questionnaire addresses at least one of the previous types of information. For 
each type of information, the subtopics and questions which provide information 
are listed. The question numbers are noted as (development question, user question).
Questions not available on a questionnaire are indicated by a “-”.

General Information 

Describes the general properties of the expert system, including the name 
(1, 1), a short description (4, 4), field of the problem (5, 5), and the type 
of problem to be solved (6, 6). Also captured is whether the survey
taker was a manager (2, 2).

Performance Criteria 

A major expertise issue is performance (probability that the results given
are correct); specifically performance of the experts (10, 9), expected
performance of the system (11, 10), and actual performance of the system
(12, 11). Related to the performance issue is the amount of the problem
space that the ES is expected to cover (8, 7), and that it actually covers
(9, 8).

Requirements Definition 

Requirements definition information includes how the requirements are
documented (13, -), the difficulty in determining the requirements (14, -),
and the availability of the expert(s) to resolve requirements issues during
development (17, -). Influencing the performance issue is the number of
experts (15, -), and whether the experts agree on the results obtained
from the system (16, 21). It may also be useful to know if the expert
(-, 12) and/or the developer(s) (18, 13) are part of the user organization.

Development Information 

Development information that we are concerned with includes the devel- 
opment life-cycle used (19, -), and what languages and tools were used 
to develop the system (20, -). The size of the system (22, -), the total
effort required for development (29, -), and the effort required to
develop the different parts of the ES (21, -) indicate the difficulty of the
development effort. The sensitivity of the system (24, -) will influence 
the difficulty of future maintenance activities. 

V&V Activities Performed 

The major information to be captured during this task is the current 
state-of-the-practice for V&V of ESs, including the kinds of V&V being 
attempted, both during (28, -) and after (33, 20) development, and how 
much of the development effort was spent on V&V (30, -). Detailed 
information is also gathered for V&V activities for Knowledge Structures 
(25, -), the Inference Engine (26, -), and the Interface Code (27, -). 
Information about the difficulty of the V&V effort (35, 22), whether a
separate group performed V&V (31, -), and how much effort was
expended on the independent V&V (32, 19), is also gathered.

Whether the system is operational or prototype (3, 3), and the criticality
of the system (37, 15) have an effect on the amount of V&V activities
performed.

V&V Issues Encountered 

If the state-of-the-practice is to be improved, the major issues that need
to be addressed must be identified. One question (36, 23) directly asks
whether each of the known issues was actually encountered. Additional
questions find out more information about specific issues, including the 
existence of certainty factors (7, -), whether configuration management 
was performed (34, -), and the difficulty of implementing the expertise 
through the Knowledge Structures (23, -). User acceptance is the ulti- 
mate test of the V&V activities. The comparison between expected 
system use (39, 17) and actual system use (40, 18), the perceived reli- 
ability of the system (38, 16), and why the user is convinced that the 
system produces correct results (-, 14) are all indicators of user accept- 
ance. 
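
The (development question, user question) notation used throughout this section can be held in a small data structure. The following Python sketch is illustrative only: the names QuestionRef and parse_ref are hypothetical and are not part of the survey materials, and the sketch assumes that a "-" marks a question that does not appear on one of the questionnaires.

```python
from typing import NamedTuple, Optional


class QuestionRef(NamedTuple):
    """A question-number pair; None means the question is not on that questionnaire."""
    developer: Optional[int]
    user: Optional[int]


def parse_ref(text: str) -> QuestionRef:
    """Parse a pair such as "(16, 21)" or "(13, -)" (hypothetical helper)."""
    dev, user = (part.strip() for part in text.strip().strip("()").split(","))

    def to_number(part: str) -> Optional[int]:
        # Assumption: "-" is the only marker used for a missing question.
        return None if part == "-" else int(part)

    return QuestionRef(to_number(dev), to_number(user))


# Pairs quoted in the text above.
agreement = parse_ref("(16, 21)")       # asked of developers and users
requirements = parse_ref("(13, -)")     # developer questionnaire only
user_confidence = parse_ref("(-, 14)")  # user questionnaire only

print(agreement, requirements, user_confidence)
```

Keeping the missing side as None rather than 0 avoids confusing an absent question with a question number.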


Human Factors 

The questionnaires were designed to capture as much accurate information as pos- 
sible. In an effort to accomplish this, the following human factors issues were taken 
into account: 

Questions should be understandable

Questions should have as few "technical" terms as possible to avoid con-
fusion due to local usage. For questions that must have technical 
content, be sure to provide sufficient explanation. 

Choices worded positively 

Negatively worded choices may not get selected because the responder 
may feel there is something wrong with it. 

Meaningful questions 

The responder should feel that there is some purpose to the question. 

Make use of fill-in-the-blank questions 

The responder should not have to fill in long responses. Some questions
cannot have all possible responses enumerated, so the responder should be
able to specify his own choice.
Summary of Results

The survey results are summarized in the following sections. The results are organized
according to the type of information, as in “Information Gathered”
on page 6. The percentages in parentheses correspond to the results from the developer
and user questionnaires, respectively. If the question is not in one of the questionnaires,
the position is filled with a “-”.

General Information 

Most of the respondents were involved with Expert Systems which 
perform Diagnosis (45%, 80%), primarily in the Aerospace field 
(46%, 100%). The survey respondents were predominantly involved 
with development (93%). 

Performance Criteria 

(37%, 40%) estimated an actual accuracy of less than 90% and 
(48%, 60%) estimated an accuracy of less than 95%. Most (60%, 40%) 
estimated the problem space coverage between 60% and 95%. In comparing
the accuracy of the expert and the expert system, most expected
the expert system to be at least as accurate as the expert (78%, 80%) while
the expert system often was estimated to be less accurate than expected
(49%, 100%) and less accurate than the expert (44%, 80%). Note that
the results show that users more often (than developers) cited the system
as being less accurate than the expert and less accurate than expected.

Requirements Definition 

(75%,-) indicated that expert consultation was a basis for determining 
the behavior of the system. More revealing is that (52%,-) said there 
were not any documented requirements and (43%,-) indicated that pro- 
totypes or similar tools were used for requirements. 

(40%,-) had medium difficulty in generating requirements while (35%,-) 
said they were hard and (25%,-) said they were easy. (58%,-) of devel- 
opers had a high level of contact with experts during development. 

Development Information 

The most frequent (40%,-) Life-Cycle model used is the Cyclic Model 
(repetition of Requirements, Design, Rule Generation, and Prototyping 
until done); however, (22%,-) of the respondents stated that no model 
was followed. Most development was done with an Expert System shell 
(CLIPS and others), and the predominant interface code languages were C and
LISP. Applications were reasonably large, requiring an average of 33 
person/months to develop. Developed systems were not reported to be 
particularly sensitive to change; (77%,-) said changes only occasionally 
caused an unexpected behavior. 

V&V Activities Performed 

Most V&V activities relied on comparison with expected results and
expert checking. Typically, (24%,-) of the development effort was spent
on V&V. While developers seemed to feel V&V was of medium difficulty,
users unanimously agreed that it was hard; (34%, 0%) said it was
medium while (27%, 100%) said it was hard and (33%, 0%) said it was
easy; (5%, 0%) said it was impossible. Of significant interest is the fact
that each V&V technique was used as the sole V&V technique in at least
one project. Also, in general, there was wide-ranging use of V&V techniques.
(39%, 20%) of the respondents indicated that the ES was a prototype
system.

V&V Issues Encountered 

The known issues most often cited as problems were: test coverage 
determination (50%, 75%), knowledge validation (44%, 75%), problem 
complexity (39%, 40%), and real-time performance analysis (40%, 25%). 
(Note that as a whole, the developers' ranking of the issues agreed with
the users' ranking of the issues.) The least cited problem was analysis of
certainty factors (only seven respondents indicated that certainty factors 
were used). Every known issue was cited by at least one respondent. 

Configuration management practices are reported to be an issue for 
many participants, regardless of whether the system was operational or a 
prototype. 

The expected system use varied widely (3-2000), while actual system use 
was relatively good (less than half of the respondents provided informa- 
tion, suggesting that actual use was much lower than reported). 

The following sections list the results from each individual question. The total 
number of responses is given for each question along with the number of times each 
choice was selected (given to the left of the choice).


General Information

The questions for the name of the ES, and the short description are not reported. 

Field of the Problem 

Question Numbers: 5, 5 
Total Responses: 70 

What field does the problem belong to? 

35 Aerospace
_4 Financial
_2 Information Systems
_8 Hardware
_6 Manufacturing
_2 Marketing
_ Medical
_1 Personnel
_2 Research
_1 Service
_4 Software
_5 Other

Type of Problem Solved 

Question Numbers: 6, 6 
Total Responses: 70 

Which of the following items best describes the kind of problem the Expert System 
addresses? Please indicate primary purpose with a and check all other applicable 
purposes (if any). 

Note: The number of times the choice was selected as primary purpose is given in 
parentheses after the number of times the choice was selected. 
13 (11) Design - Configuring objects under constraints
11 (_0) Repair - Executing plans to administer prescribed remedies
11 (_5) Control - Governing overall system behavior
16 (_5) Planning - Designing actions
34 (23) Diagnosis - Inferring system malfunctions from observables
11 (_1) Debugging - Prescribing remedies for malfunctions
16 (_3) Prediction - Inferring likely consequences of given situations
23 (_8) Monitoring - Comparing observations to expected outcomes
12 (_1) Instruction - Diagnosing, debugging, and repairing behavior
15 (_5) Interpretation - Inferring situation descriptions from sensor data
_5 (_2) Classification - Categorizing objects by properties
_3 ( ) Others


Role on Project 

Question Numbers: 2, 2 
Total Responses: 70 

Were you a developer of the Expert System, the manager of the development organization,
a user of the Expert System, or the manager of a department which uses the
Expert System?

42 Developer of Expert System
_6 Manager of Expert System development organization
17 Other Development
_4 User of the Expert System
_ Manager of a department using the Expert System
_1 Other User
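
The tallies above can be reduced to the share of respondents on the development side. The short Python sketch below is an illustration, not part of the survey: it assumes that the three development choices together count as "involved with development," and the variable names are hypothetical. Under that assumption the result is 93%, the figure quoted for development involvement in “Summary of Results.”

```python
# Tallies copied from the "Role on Project" question above.
role_counts = {
    "Developer of Expert System": 42,
    "Manager of Expert System development organization": 6,
    "Other Development": 17,
    "User of the Expert System": 4,
    "Manager of a department using the Expert System": 0,
    "Other User": 1,
}

# Assumption: these three choices make up the development side.
development_roles = [
    "Developer of Expert System",
    "Manager of Expert System development organization",
    "Other Development",
]

total = sum(role_counts.values())
development = sum(role_counts[role] for role in development_roles)
development_share = round(100 * development / total)

print(f"{development} of {total} respondents ({development_share}%) "
      "were on the development side")
```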


Performance Criteria 


Performance of the Experts 

Question Numbers: 10, 9
Total Responses: 70 

If human experts currently perform (or previously performed) the task, how often is 
the expert(s) expected to give the correct answer? 

_2 Task not performed by human 
~\1 "Correct" defined by expert 
19 > 99%
16 95% to 99%
_4 90% to 95%
_4 80% to 90%
_1 60% to 80%
_ 40% to 60%
_4 Other (2 - 100%)
_3 I don't know
Expected Performance of the System

Question Numbers: 11, 10
Total Responses: 70 

How often is the Expert System expected to provide the correct answer? 

22 100% 

16 > 99% 

_9 95% to 99% 

10 90% to 95% 

_4 80% to 90% 

_3 60% to 80% 

_ 40% to 60% 

_1 Other 

_5 I don't know 

Actual Performance of the System 

Question Numbers: 12, 11 
Total Responses: 68 

What is your estimate of how often the Expert System actually provides the correct 
answer? 

11 100%

11 > 99% 

12 95% to 99% 

10 90% to 95% 

8 80% to 90% 

~J 60% to 80% 

_1 40% to 60% 

_3 Other ( < 40%) 

_7 I don't know 

Expected Problem Space Coverage 

Question Numbers: 8, 7 
Total Responses: 70 

How much of the problem space is the Expert System expected to cover? 

15 100% 

12 > 99% 

_6 95% to 99% 

_7 90% to 95% 

13 80% to 90% 

_4 60% to 80% 

_4 40% to 60% 

_4 Other 

5 I don't know 


Actual Problem Space Coverage 

Question Numbers: 9, 8 
Total Responses: 70 

What is your estimate of the problem space coverage actually provided by the
Expert System? 

_4 100%
_3 > 99%

_8 95% to 99% 
_3 90% to 95% 

14 80% to 90% 

19 60% to 80% 
_8 40% to 60% 
_7 Other (1 - 5%) 
8 I don't know 


Requirements Definition 

Requirements Format 

Question Numbers: 13, - 
Total Responses: 62 

What was the basis for determining how the system was to behave? Please indicate 
the primary basis with a and check all other applicable bases (if any).

Note: The number of times the choice was selected as primary basis is given in 
parentheses after the number of times the choice was selected. 

12 (_4) A pre-existing document 

19 (_4) A requirements document completed as part of development. 

_6 ( ) Some other developed document 

27 (_4) A prototype of the system 
49 (38) Expert consultation 
_ 6 (_) 

Requirements Difficulty 

Question Numbers: 14, - 
Total Responses: 63 

How difficult was it to develop the original concept of what the system was sup- 
posed to do? 

_7 Trivial 
15 Easy 
25 Medium 
15 Hard 
_1 Impossible 

Availability of the Expert(s) 

Question Numbers: 17, - 
Total Responses: 53 

If the system was not developed by the expert, how much interaction was there 
between the expert(s) and the development team? 

_6 System was developed by expert
10 Constant
15 Frequent
17 Regular
_5 Occasional
_ None
Number of Experts

Question Numbers: 15, - 
Total Responses: 64 

Was more than one expert consulted during the development of the system? 

10 System was developed by expert 
_6 Single expert 

30 Multiple experts with lead 
12 Committee of experts 
_6 Other 

Agreement Among Experts 

Question Numbers: 16, 21 
Total Responses: 61 

If more than one expert was available for consulting, how often did the experts 
agree on what results the Expert System was supposed to provide? 

_6 A single expert was involved 

11 Always agree

44 Agree 75% of the time (range 30%-99%) 


Expert in User Organization 

Question Numbers: -, 12
Total Responses: 5

Was the expert(s) a member of the user organization? 

_5 Yes
_ No
_ User organization provided some expertise

Developers in User Organization 

Question Numbers: 18, 13 
Total Responses: 69 

Was the developer(s) of the Expert System part of the user organization? 

25 Yes 
31 No 

13 Some development provided by user organization 


Development Information 

Development Life-Cycle Used 

Question Numbers: 19, - 
Total Responses: 58 

Please indicate which development model was used for developing the Expert 
System. 

_5 Requirements gathering preceded Design, Implementation, and Test (Tradi- 
tional waterfall life-cycle). 

12 Requirements gathered before development of a prototype. A second 
requirements activity preceded Design, Implementation, and Test. 
25 Repetition of the Requirements, Design, Rule Generation, and Prototyping
phases until production system (final prototype) was developed. 

14 No effort was made to follow a particular model. 

_2 Other 

Languages and Tools Used 

Question Numbers: 20, - 
Total Responses: 64 

What was the primary language/tool for the knowledge structures?

Note: The most frequent languages/tools are reported after the choice as: “fre- 
quency - language/tool.” 

Knowledge Structures (17 - ESE, 13 - CLIPS, 10 - LISP, others) 


Size of the System 

Question Numbers: 22, - 
Total Responses: 39 

Since Knowledge Bases can be written using several types of Knowledge Structures,
please indicate how many of the following structures were used. If another type of 
structure was used, please describe it and how many were used. 

Note: The number of times that a value was given for each choice is provided in 
parentheses followed by the average value for that response. The range of the 
responses is given in parentheses after each choice. 

(35) 235 Rules (range 30-1000) 

(15) 872 Frames (range 1-10000) 

(10) 248 Facts (range 50-800) 

(15) 121 Parameters (range 20-400) 

(_2) 8K Statements (range 2K-16K)

Total Development Effort 

Question Numbers: 29, - 
Total Responses: 57 

How much effort was expended in developing the system, including evaluation
activities performed by the developers? 33 (range 1-200) person/months. 

Detailed Development Effort 

Question Numbers: 21, - 
Total Responses: 64 

What percentage of the total development effort was dedicated to each part of the 
Expert System? 

61% Knowledge Structures
_8% Inference Engine
31% Interface Code
System Sensitivity

Question Numbers: 24, - 
Total Responses: 64 

When changes were made to the knowledge structures, how often did some unex- 
pected result occur? 

_5 Never 
44 Occasionally 
_9 Frequently 
_5 Usually 
_1 Always 
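
As a worked check, the tallies for this question can be converted to percentages. The Python sketch below is illustrative only: the helper percent_distribution is hypothetical, and percentages are rounded to whole numbers. It also assumes that the "only occasionally" figure in “Summary of Results” groups the Never and Occasionally choices; under that assumption the combined share is 77%, which agrees with the summary.

```python
def percent_distribution(counts):
    """Return each choice's share of the total, rounded to a whole percent."""
    total = sum(counts.values())
    return {choice: round(100 * n / total) for choice, n in counts.items()}


# Tallies copied from the "System Sensitivity" question above.
sensitivity = {
    "Never": 5,
    "Occasionally": 44,
    "Frequently": 9,
    "Usually": 5,
    "Always": 1,
}

shares = percent_distribution(sensitivity)

# Assumption: "only occasionally" in the summary covers Never and Occasionally.
rarely = sensitivity["Never"] + sensitivity["Occasionally"]
rarely_share = round(100 * rarely / sum(sensitivity.values()))

print(shares)
print(f"Never or Occasionally: {rarely_share}%")
```

Because each share is rounded separately, the individual percentages need not add up to exactly 100.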


V&V Activities Performed 


V&V Activities during development 

Question Numbers: 28, - 
Total Responses: 63 

What testing activities were performed on the executing system? (indicate any that 
apply) 

_2 No evaluation was performed 
38 Checked by expert(s) 

32 Compared with expected results 
28 Structural testing (e.g. cover all rules) 

18 Other 


V&V Activities after development 

Question Numbers: 33, 20 
Total Responses: 47 


What testing activities were performed on the executing system before the system 
was delivered to the users? (indicate any that apply) 


_1 No evaluation was performed 
33 Checked by expert(s) 

39 Compared with expected results 
29 User acceptance 
16 System run in parallel 
5 Other 


Development Effort Spent on V&V

Question Numbers: 30, - 
Total Responses: 62 


How much of the development effort was spent on evaluation? 24% (range
2%-80%)
V&V of Knowledge Structures

Question Numbers: 25, - 
Total Responses: 65 

What evaluation activities were performed on the Knowledge Structures? (indicate 
any that apply) 

_3 No evaluation was performed 
28 Desk checking 
15 Formal inspections 
42 Checked by expert(s) 

39 Structural testing (e.g. cover all rules) 

_9 Other 

V&V of Inference Engine 

Question Numbers: 26, - 
Total Responses: 35 

What evaluation activities were performed on the Inference Engine? (indicate any 
that apply) 

17 No evaluation was performed (ES shell was used) 

_2 No evaluation was performed 
_3 Desk checking 
10 Formal inspections 
_5 Structural testing 
Other 


V&V of Interface Code 

Question Numbers: 27, -
Total Responses: 58

What evaluation activities were performed on the Interface Code? (indicate any that 
apply) 

_7 No evaluation was performed 
25 Desk checking 
12 Formal inspections

29 Structural testing (branch or path) 

18 Experts 
Other 

Difficulty of V&V 

Question Numbers: 35, 22 
Total Responses: 67 

Compared to conventional software testing efforts, how difficult was the evaluation 
of the Expert System?

_3 Trivial 
16 Easy 
20 Medium 
20 Hard 
_3 Impossible 
_4 No evaluation was done
Separate V&V group

Question Numbers: 31, - 
Total Responses: 62 

Did a separate organization evaluate the Expert System before it was delivered to 
the users? 

15 Yes, there was a separate evaluation organization. 

47 No, there was not a separate evaluation organization. 

Independent V&V Effort 

Question Numbers: 32, 19 
Total Responses: 11

If there was a separate evaluation team, how much effort was expended by the team 
in evaluating the correctness of the Expert System? 

(11) 3 (range 1-7) person/months reported by developers
(3) 16 (range 3-24) person/months reported by users 

Operational or Prototype System 

Question Numbers: 3, 3 
Total Responses: 70 

Is the Expert System operational or is it a prototype? 

42 Operational system 
25 Prototype system 
_3 Operational prototype (write in) 


System Criticality 

Question Numbers: 37, 15 
Total Responses: 69 

How reliable is the Expert System required to be? 

_7 Trusted with human life 
15 Trusted with mission objectives 
31 As reliable as the expert
17 Assists the expert 
19 Assists the user 
Other 


V&V Issues Encountered 






Known Issues Actually Encountered 

Question Numbers: 36, 23 
Total Responses: 66 


Many people feel that some development issues are more of a problem with Expert 
Systems than with conventional systems. Which (if any) of the following were 
problems during implementation or test of this Expert System? 

13 Understandability and readability of knowledge structures 
34 Determining test coverage for knowledge structures 
19 Modularity; Design of knowledge structures
30 Knowledge validation
_6 Analysis of Certainty Factors 
_8 Validating the inference engine 
26 Real-time performance analysis 
26 Complexity of the Problem 
14 Certification 

_9 Configuration Management 
6 Other 


Certainty Factors 

Question Numbers: 7, - 
Total Responses: 64 

Does the Expert System include certainty factors? 

J Yes 
54 No 

I don't know 

Configuration Management 

Question Numbers: 34, - 
Total Responses: 45 

How were changes to the Expert System distributed to the users? 

_5 User updated system at developer's direction 
18 Developers made changes to users' system
_1 Untested system distributed to users 
22 Tested system distributed to the users 
_3 Configuration management group distributes system 
1 Other 


Expertise Implementation Difficulty 

Question Numbers: 23, - 
Total Responses: 62 

Aside from any difficulties in developing the original concept, how difficult was it to 
express the behavior (through the Knowledge Structures) of the expert? 

_3 Trivial 
16 Easy 
20 Medium 
20 Hard 
_3 Impossible 


Expected System Use 

Question Numbers: 39, 17 

Total Responses: 50 

How many people are expected to make use of the Expert System? 219 (range 1-2000)
Perceived System Reliability

Question Numbers: 38, 16 
Total Responses: 68 

Does the Expert System seem to be more reliable or less reliable than conventional 
systems that are in use? 

_9 Significantly more reliable 
16 More reliable 
_3 Slightly more reliable 
19 Similar reliability 
_2 Slightly less reliable 
_1 Less reliable 
_ Significantly less reliable 
14 No comparison is available 
_4 I don't know 

User Trust 

Question Numbers: 14 

Total Responses: 5 

Why do you believe the results that the system gives? 

_1 Expert says it is correct 
_3 Participated in evaluation 

Someone I trust did evaluation 

_5 Personal use and checking 
_1 User acceptance 

I don't trust the results 

Other 
Summary of Interview Results

In addition to acquiring written responses to the survey questions, interviews were 
performed to gather additional data and to clarify questions concerning the written 
responses. Additional information from these interviews is summarized in this
section. 

Structural Testing: Based on the survey results, a commonly used evaluation 
approach was the use of structural testing. This was surprising because it was felt 
that structural testing was relatively difficult to apply to expert systems. From the 
interviews, we learned that although some projects did attempt to measure the 
actual test coverage (i.e., percentage of rules executed during testing) many others 
did not actually measure the coverage. Instead, they attempted to develop test cases 
that would cover all of the knowledge base (or at least the important parts) but 
made no attempt to measure how well the knowledge base was actually covered. 
Also, there appeared to be no attempt to cover interactions between knowledge base
elements (e.g., rule interactions); each element was tested as if it were an inde- 
pendent piece of the knowledge base. Some knowledge base developers felt that 
more formal structural testing would be too much effort and would hinder the 
development process too much. In conclusion, it seemed that, although structural 
testing was used, it was a very weak form of structural testing (at least compared to, 
say, branch coverage in procedural software testing). 
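
The difference between designing test cases that are meant to cover the knowledge base and actually measuring that coverage can be illustrated with a short sketch. The rule names and the per-test firing log below are hypothetical, and nothing here is taken from a surveyed project. Counting rule pairs that fired together is used only as a crude, assumed stand-in for the rule interactions mentioned above.

```python
from itertools import combinations


def rule_coverage(all_rules, fired_per_test):
    """Return the fraction of rules that fired in at least one test case."""
    fired = set().union(*fired_per_test) if fired_per_test else set()
    return len(fired & set(all_rules)) / len(all_rules)


def pair_coverage(all_rules, fired_per_test):
    """Return the fraction of rule pairs that fired together in some test case."""
    all_pairs = set(combinations(sorted(all_rules), 2))
    if not all_pairs:
        return 1.0  # fewer than two rules: there are no pairs to cover
    seen = set()
    for fired in fired_per_test:
        seen.update(combinations(sorted(set(fired) & set(all_rules)), 2))
    return len(seen) / len(all_pairs)


# Hypothetical knowledge base with four rules and a log of three test cases.
rules = ["R1", "R2", "R3", "R4"]
log = [{"R1", "R2"}, {"R3"}, {"R1"}]

print(rule_coverage(rules, log))  # 0.75: R4 never fired
print(pair_coverage(rules, log))  # 1 of 6 pairs fired together
```

In this sketch three of four rules are exercised, yet only one of the six possible rule pairs is ever exercised together, which mirrors the observation that each element was tested as if it were independent.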

Experts Developing Expert Systems: It appeared that the expert was heavily relied 
upon to aid in evaluation of the knowledge base; this subject was probed more 
deeply during the interviews. It seems that a close interaction between the expert 
and the knowledge base developer was mandatory to successfully develop an expert 
system. This is not a surprising result and it has been discussed at length in the liter- 
ature. However, it was surprising to learn that many knowledge base developers feel 
that this interaction is so important that they think the best approach is simply to 
have the expert develop the system. However, one non-programmer interviewee, 
who felt that his group was being successful at having experts develop their own 
systems, also thought that this approach would have to be altered to some extent in
order to be successful at the more sophisticated types of expert systems that they 
would be developing in the future. 

Requirements Writing and the Conventional Software Life-Cycle: It was antic- 
ipated that expert systems were being developed using a much more iterative and 
less structured life-cycle than the conventional and rigid waterfall model. And, 
although the subject of life-cycle models was not intentionally addressed during the 
interviews, it often came up when discussing requirements. It seems that several 
respondents associated “requirements” with the conventional waterfall model and 
they felt very strongly that the conventional approaches to software development, 
such as the waterfall model, were much too formal and structured for expert systems 
development - that is, it would be disastrous to apply them to expert systems.
Though for some this feeling extended to requirements, others simply used a
different approach to requirements. For example, in some cases, requirements were not
written because it was felt that a requirements document was a formally written 
paper document that needed to be “approved” before development could proceed. 
While in other cases, an iterative prototyping development effort took place and was 
followed by documenting system requirements; these requirements were then used 
to test the system to ensure that it worked as everyone thought it (supposedly) did. 
Prototypes vs. Operational Systems: Although we attempted to get respondents
to state that their system was either “a prototype” or “operational,” we received
indications that this distinction was not easy to make in practice. For example,
responses included “it is both a prototype and operational,” or “it is an operational
prototype,” or “it is just a prototype but we have many users.” It seems that some
systems are originally intended to be a prototype but become used operationally. 
Some intentionally approach the development of an operational system by first 
developing a “prototype” and once the prototype is “certified,” it is considered 
“operational.” However, there is a danger that a prototype will be used as if it were 
operational. Some have made efforts to ensure that a system that was only intended 
to be a prototype system was not accidentally relied upon in an operational setting. 

Real-Time Performance Analysis: It was intended that “real-time performance
analysis” would refer to the ability to predict the response time for an expert
system, that is, the ability to analyze the time performance of the system. However,
from the interviews we learned that many interpreted “real-time performance
analysis” to mean the ability to get the system to run as fast as desired/necessary.

Issues Independent of A System Being an Expert System 

An important, but difficult, aspect of analyzing expert system development method- 
ology is distinguishing properties of expert systems that are significantly different 
from properties of conventional software. This is also an important aspect of the 
analysis of this survey of V&V issues. Several comments appeared to be due more 
to factors other than the fact that the system being developed was an “expert” 
system. The interviews helped clarify this issue which the remainder of this section 
discusses. 

Extensive Use of Prototyping and Rapid Development: The conventional 
waterfall life-cycle model has proven to be ineffective for conventional software 
development so it is no surprise that developers do not want to use it for expert 
system development. A more iterative model (e.g., the spiral model) that includes
the use of rapid prototyping is being perceived as a better alternative to the waterfall
model. “Conventional” software development projects often include the use of
prototyping, developing better user interfaces, having more user involvement during
development, or having developers better understand the problem domain; these are not
issues or approaches that are unique to expert system development.

Small/Simple vs. Large/Complex Systems: Although some of the systems
surveyed are fairly large (e.g., 200 person-months), they are generally much smaller than
dedicated software development projects (e.g., Shuttle MCC, Shuttle flight software, 
etc.). The systems surveyed seem to be isolated efforts to develop off-line applica- 
tions for niches for which expert system technology was felt to be very suitable. 

That is, they were not part of a larger software system, though
they are often used in conjunction with a large data processing system (e.g., they
receive real-time data from a large data processing system). This allowed the expert
system developers to work without many of the constraints imposed on larger
systems (e.g., tightly controlled configuration management).

Addressing a Knowledge Engineer Instead of a Programmer: Although we did
not intend to gather information on the experience and background of individual
expert system developers, we did learn that several respondents involved in
developing expert systems are experts in a problem domain and do not have much
programming experience. This fact will be important when considering
recommendations (see “Recommendations” on page 23); that is, the
recommendations should not assume first-hand knowledge of conventional software V&V
techniques.

Summary: It may be the case that the above issues are indeed typical of expert 
system development projects and that they should be addressed when addressing 
V&V of expert system problems. However, it should be recognized that they are 
somewhat different than the other issues that are true of all expert systems regardless
of their size and who is developing them. This may point to a need to tailor sug- 
gestions for V&V of expert systems to considerations such as the size of the expert 
system, the experience of the developer, whether the system is embedded in a much 
larger software system, etc. 
Recommendations


The recommendations from the survey results are separated into two categories: 

Direct Recommendations 

Recommendations in this category are directly supported by the survey 
results. These recommendations include: 

• Develop Requirements for Expert System Verification and Vali- 
dation 

• Address Most Often Encountered Issues 

• Recommend a Life Cycle for Expert Systems Development 

Inferred Recommendations 

Recommendations in this category can be inferred from the survey 
results by analyzing relationships among the responses. These recomm- 
endations include: 

• Address Readability and Modularity Issues 

• Address Configuration Management Issue 

• Develop Criteria to Classify Expert Systems by Intended Use 

• Investigate Applicability of Analysis Tools 

Following each general recommendation is an explanation of what was observed in 
the survey results. After this explanation is a list of specific recommendations which 
address all the observations. Each specific recommendation in the “Direct Rec- 
ommendations” section is followed by a list of supporting phrases from “Summary 
of Results” on page 8. 


Direct Recommendations 

Develop Requirements for Expert System Verification and Validation 

The major goal of this survey task was to discover and document the current state 
of the practice in Verification and Validation of Expert Systems. Based on the 
survey results, it appears that much can be done to improve the practice. The lack 
of requirements for performing V&V on ESs was manifested in several forms: 

• The V&V activities performed were very inconsistent, ranging from none to very 
many, and the sets of activities performed were very diverse. 

• The reliance on expert consultation as the only source of requirements was 
extremely high. 

• The reliance on experts to perform V&V activities on the knowledge base, inter- 
face code, and executing systems was very high. 

• The low performance levels for many of the expert systems were surprising.
Although it is not known what is acceptable reliability for the systems that were
surveyed, often the estimated actual reliability was less than the expected
reliability. Also, it is unlikely that conventional software systems that exhibited a
similar level of performance would gain wide acceptance. (For example, many
reported that the ES provides the correct answer less than 90% of the time.
Most conventional software reliability is rated as a series of '9's, e.g., 4 '9's
means the correct answer is given > 99.99% of the time.)
• In those cases where the expected behavior of the system was not strictly
defined by expert consultation, a large number of systems relied on prototypes. 
This is significant because prototype systems receive less V&V than operational 
systems, but are then used to define the behavior of operational systems. 
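
The “series of 9s” convention used in the observations above can be made concrete with a small worked calculation. The function names are illustrative assumptions; the arithmetic is simply 1 - 10^(-n) for n nines, compared against the 90% figure that many respondents reported as an upper bound.

```python
def nines_to_reliability(n):
    """Return the minimum fraction of correct answers implied by n nines."""
    return 1 - 10 ** (-n)


def failures_per_10000(reliability):
    """Return the expected number of incorrect answers in 10,000 queries."""
    return round((1 - reliability) * 10_000, 6)


print(failures_per_10000(nines_to_reliability(4)))  # 1.0 (99.99% correct)
print(failures_per_10000(0.90))                     # 1000.0 (90% correct)
```

Under these assumptions, a system that is correct 90% of the time gives roughly a thousand times as many wrong answers as one rated at four nines.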

Each of the above observations can be directly attributed to three factors: 

1. There is a general lack of understanding on how to V&V ESs. The wide 
ranging use of V&V approaches (e.g., each technique being used as the sole 
technique by at least one project) indicates that there is no clear approach to 
V&V. That is, it is not known what V&V activities are to be performed, when
the activities should be performed, or how the activities can be accomplished. 
This could, in part, be due to the software experience level of some of the devel- 
opers. 

2. There is little understanding of how requirements for an ES should be generated 
and documented. It could be argued that this is a development issue, but 
without documented expected behavior, there is no possibility of performing 
adequate V&V. 

3. A large number of expert systems are prototypes for which V&V receives little 
consideration. 

Recommendations 

1. Develop recommendations and/or guidelines for Verification and Validation of
Expert Systems. (Since such a significant amount of research has been devoted 
to V&V of traditional software, it may be appropriate to approach this task as a 
set of modifications to current conventional software V&V requirements.) 

These guidelines should include the ability for customization based on system 
size, developer software experience, whether it is stand-alone or a part of a 
much larger system, etc. 

“75% of the respondents indicated that expert consultation was a basis for 
determining the behavior of the system.” 

“Most V&V activities relied on comparison with expected results and expert
checking”

“In most cases, there was not a separate group to perform V&V”

2. Initial efforts to define V&V requirements should be focused on diagnostic 
systems, since a large majority of the systems surveyed performed diagnostic ser- 
vices. 

“Most ... perform Diagnosis (45%, 80%) ...”

3. Research the process of converting prototype ESs into operational systems. A
large number of respondents indicated that they were either building prototypes 
for later conversion into operational systems, or building operational systems 
based on prototypes. 

“43% of respondents indicated that prototypes or similar tools were used 
for the requirements” 

“39% of the respondents indicated that the ES was a prototype system.” 




Address Most Often Encountered Issues 

All of the known issues with performing V&V on Expert Systems were cited at least 
once in the survey. A small group of issues, however, were cited significantly more 
often than others and included: 

1. Determining test coverage,

2. Knowledge validation,

3. Real-time performance analysis,

4. Complexity of the problem.

The first two issues are well understood and are active research areas. These
research areas should be matured so that solutions to these issues can be
provided.

The issue of real-time performance analysis was briefly discussed earlier (see 
“Summary of Interview Results” on page 20). Since this issue may most often be 
interpreted as the inability to get the expert system to run fast enough, and this is 
not a V&V issue, it is not clear that any recommended action is needed. However, 
it did appear from the descriptions of the expert systems, that the ability to predict 
the response time of the system should not be a major issue for current expert 
systems so it is not felt that any recommendation is needed at this time. 

The complexity issue is not as well understood. There is considerable opinion that
the types of problems addressed by ESs are significantly harder than the problems 
addressed by conventional software. Others maintain the apparent difficulty is attri- 
buted to the lack of requirements (see above). In either case, there does not seem to 
be a way to approach the complexity issue without considering it in the context of 
the readability and modularity issues, as done in “Address Readability and Modu- 
larity Issues” on page 26. 

Recommendations 

1. Develop tools and/or methods to support the determination of test coverage. 

“The known issues most often cited as problems were: test coverage deter- 
mination (50%, 75%) ...” 

2. Develop methods and/or tools to support the knowledge validation activity. 

“The known issues most often cited as problems were: ... knowledge vali- 
dation (44%, 75%) ...” 

3. Develop methods and/or tools to assist in managing problem complexity. 

“The known issues most often cited as problems were: ... problem com- 
plexity (39%, 40%) ...” 

Recommend a Life Cycle for Expert Systems Development 

The most common Life Cycle applied to the development of the ESs included in 
this survey was the Cyclic model. In the Cyclic model, the stages of requirements, 
design, knowledge base development, and test are repeated until the final system is 
developed. The testing activities at the end of each cycle (except the last) lead to the 
refinement of the requirements that will be used in the successive cycle. Several var- 
iations, including some with a fixed number of cycles, have been proposed. 

A large number of respondents, however, indicated that no attempt was made to
follow any model. If no model is being followed, there is little opportunity to apply 
V&V activities at the appropriate points during development. Clearly, any life cycle 
guidelines would be of benefit in these situations. Multiple life-cycle approaches, or
a single very flexible life-cycle should be recommended. 

Recommendation 

1. Multiple life cycle models, or a single, very flexible life cycle model should be 
recommended for development of ESs. (The high incidence of prototypes 
leading to operational systems suggests that the cyclic model should be recom- 
mended. Rapid prototyping could be treated as a special case of the cyclic 
model.) 

“The most frequent (40%) Life-Cycle model used is the Cyclic Model ... 
however, 22% ... stated that no model was followed.” 

“43% of respondents indicated that prototypes or similar tools were used for
the requirements”

“(39%, 20%) of the respondents indicated that the ES was a prototype
system.” 


Inferred Recommendations 

Address Readability and Modularity Issues 

Readability and modularity were expected to be significant issues, but were not the 
most frequently cited problems. Further analysis of the survey results indicates that
the readability and modularity issues may have been reported as other problems. 
This analysis includes the following observations: 

• As often as not, people chose modularity or readability as problems, but not 
both. This seems to indicate that many respondents do not see the relationship 
between the two. 

• Similarly, as often as not, people picked test coverage determination without 
picking modularity, so the apparent relationship between these two issues was
not established. 

• The lack of reported relationships between the readability, modularity, and test 
coverage issues is very confusing, implying, for instance, that a rule can be 
understood but a test scenario for it can not be developed. 

• Readability and complexity of the problem were very rarely chosen together. 
That is, the developer recognizes that the ES was complicated but attributed this 
complexity either to the problem or to the solution, but not both. It is ques- 
tionable that the complexity of the problem and the complexity of the solution 
can be easily distinguished. (The emergence of Object-oriented programming 
languages is due, in part, to the claim that conventional languages cause pro- 
gramming complexities which are erroneously attributed to problem com- 
plexity.) 

If the numbers of times each of these issues was reported are added together, the
collection of issues becomes a very frequently cited problem. Since these issues are 
so closely interrelated, they should be addressed as a single issue. Therefore, the 
problem of reducing overall complexity (problem/ solution) is a very important issue. 
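
The aggregation described above can be sketched with the tallies listed under “Known Issues Actually Encountered” (66 total responses). Because a respondent could select several issues, the sum counts citations rather than distinct respondents. Grouping readability, modularity, and problem complexity is an assumption drawn from the recommendation that follows, and the short dictionary keys are labels chosen here for illustration.

```python
# Citation counts copied from "Known Issues Actually Encountered"
# (66 total responses; respondents could select more than one issue).
issue_counts = {
    "readability": 13,           # Understandability and readability of knowledge structures
    "test coverage": 34,         # Determining test coverage for knowledge structures
    "modularity": 19,            # Modularity; design of knowledge structures
    "knowledge validation": 30,
    "problem complexity": 26,
}

# Assumed grouping: the three issues named in the recommendation that follows.
related = ["readability", "modularity", "problem complexity"]

combined = sum(issue_counts[name] for name in related)
most_cited_single = max(issue_counts, key=issue_counts.get)

print(combined)                                             # 58
print(most_cited_single, issue_counts[most_cited_single])   # test coverage 34
```

Taken together, the three related issues are cited more often than test coverage determination, the most frequently cited single issue.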

Recommendation 

1. Develop methods and/or tools to support the readability, modularity, and 
problem complexity issue. 
Address Configuration Management Issue

Configuration management was an infrequently cited problem. However, the survey 
results also show that in practice the applied CM, while sometimes quite good, was 
generally poor (changes to the knowledge base were not well managed). This con- 
tradiction is probably due to the high frequency of prototypes and "in development" 
responses to the survey. While there are certain applications for which CM may 
never be a significant issue, certainly there are applications for which CM is a very 
important issue. 

Recommendation 

1. Identify the differences between CM of conventional software systems and CM 
of expert systems. It is not immediately obvious that there are differences. 

Develop Criteria to Classify Expert Systems by Intended Use 

The survey results indicate that there is a very diverse set of applications which are 
utilizing ES technology. At least the following types of applications exist: 

Expert Clone 

Provides expert assistance to a human user. The expert is usually
available if the ES does not provide the correct results. The major uses of
this type of ES include education and capture of true institutional
knowledge.

Expert Assistant 

Allows the user, typically an expert, to concentrate on the more
important aspects of the task. These ESs typically serve as filtering
mechanisms.

Autonomous 

Limited supervision is applied to the ES. In addition to providing
filtering, these systems typically develop and execute plans to handle
situations.

A subcategory of Autonomous ESs is time-critical ESs. These ESs
exist primarily because experts can not interpret data efficiently enough 
to perform the task in the allotted time. 

Self-modifying autonomous 

Part of the planned execution is to modify its knowledge base to respond 
to certain situational data. The application of V&V to this type of 
problem is currently uncertain. 

Traditional Software Problem 

Some conventional problems (e.g., discrete event simulation) are more
conveniently implemented using expert system shells.

It is apparent that because of this diversity, a single set of V&V requirements is 
probably undesirable. Development of classification criteria allows a simplification 
of ES V&V requirements. In addition to simplification, classification allows the 
development of requirements to be concentrated on the types of applications of 
interest. 

Recommendations 

1. Develop classification criteria to distinguish among expert systems which require
different V&V approaches. 
2. Concentrate initial V&V requirements definition effort on autonomous systems,
since these systems are likely the most critical. 

Investigate Applicability of Analysis Tools 

A very large number of respondents indicated that experts were the primary source 
of requirements and verification. Several of the previous recommendations would 
reduce this dependence, but there is a class of expert system applications for which 
expert consultation will continue to be the leading source. 

Recommendations 

1. Determine if there is a communication problem between the experts and the
knowledge engineers / expert system developers. 

2. If a communication problem exists, investigate the possibility of representing
the Knowledge Base in a form that domain experts can easily, yet accurately,
understand.
Appendix A. Detailed results

The following table represents the raw data from the survey of expert system
developers. Except for questions number 1 and 4 (see footnote 1), there is a column in the table for each
question in the survey. The column headers have a number in parentheses corre- 
sponding to the question number in the survey. There is also a short mnemonic 
representing the subject of the question to facilitate cross reference to the correct 
survey question. 


Summary of Developers Responses (part 1) 


1. Answers to questions 1 and 4 are not provided because these would identify the survey respondent.
Appendix B. Expert Systems Evaluation Questionnaire (Developer)


Instructions


By filling out this NASA funded questionnaire, you can help define the state-of-the- 
practice in the formal evaluation of Expert Systems on current NASA and industry 
applications. The information that you provide will be merged with the information 
from all other surveyed projects for the purpose of recommending future research 
and development activities. Individual responses are used solely as input to this 
information merging process. Each survey participant will be sent a copy of the 
final survey results. 

Expert System applications are becoming more prevalent in fields where proper 
functioning is essential, such as the aerospace, medical, and financial industries. It is 
widely claimed that Expert Systems are not as rigorously evaluated as traditional 
software because of unique, unresolved evaluation issues. To ensure the continued 
and safe deployment of Expert Systems into critical areas, adequate evaluation tech- 
niques which address these issues must be developed and performed. 


The following questions concern your experiences with an Expert System, either as 
a developer or as the manager of the development effort. Feel free to indicate your 
answers in any way you like. Some of the choices on the multiple choice questions 
have places to fill in additional information; please indicate the choice and include 
the additional information, if possible. If you have any comments about the 
questions or your answers, please write them in the left margin. 

Analysis of the responses may indicate that further discussion is required for com- 
plete understanding of the issues encountered during the evaluation process. Dis- 
cussions will be held either as short one-on-one meetings or by telephone. Would 
you be available, at your convenience, to discuss the evaluation process in more 
detail? 

Yes I am available for discussions.

Name 

Phone 

No I am not available for discussions. 

If you have any questions regarding this questionnaire, please contact Keith Kelley 
at (713) 282-7303. If possible, please return completed questionnaires within one 
week of receipt to:

Keith Kelley 
MC 6606 

IBM Federal Sector Division 
3700 Bay Area Blvd. 

Houston, Tx. 77058-1199 


Definitions 

Certainty factors 

Some problems require the use of certainty factors (also called probabili- 
ties, or fuzzy logic) in their processing. Facts which contain certainty 
factors have the form: “if a is true, then there is an x% chance that b is 
true.” 

Expert 

The person who provides the knowledge that is to be captured in the 
Expert System. 

Inference engine 

Processes the knowledge structures to infer a set of output facts from a 
set of input facts. Examples of commercial systems are CLIPS and 
ESE. 

Interface code 

Used to supplement the inference process. Examples are interfacing the 
inference engine to a device, and performing arithmetic calculations. 

Knowledge structures 

Declarative part of the Expert System which represents the knowledge 
(typically called the Knowledge Base). Examples are frames and rules. 

Problem space 

The total number of cases which could potentially be addressed by the 
Expert System. 

Problem space coverage 

The percentage of the problem space that is addressed by the Expert 
System. For example, if the Expert System is supposed to be able to 
diagnose 100 malfunctions, but the total number of malfunctions is 
known to be 200, the problem space coverage is 50%. 

Questions 

1. What is the name of the Expert System you were/are involved with? 


2. Were you a developer of the Expert System or the manager of the develop- 
ment organization? 

a. Developer of Expert System 

b. Manager of Expert System development organization 

c. Other 


3. Is the Expert System operational or is it a prototype? 

a. Operational system b. Prototype system 

4. Briefly describe what the expert system does. 


5. What field does the problem belong to?

a. Aerospace                 g. Medical
b. Financial                 h. Personnel
c. Information Systems       i. Research
d. Hardware                  j. Service
e. Manufacturing             k. Software
f. Marketing                 l. Other


6. Which of the following items best describes the kind of problem the Expert 
System addresses? Please indicate primary purpose with a and check all 
other applicable purposes (if any). 



a. Design - Configuring objects under constraints

b. Repair - Executing plans to administer prescribed remedies

c. Control - Governing overall system behavior

d. Planning - Designing actions

e. Diagnosis - Inferring system malfunctions from observables

f. Debugging - Prescribing remedies for malfunctions

g. Prediction - Inferring likely consequences of given situations

h. Monitoring - Comparing observations to expected outcomes

i. Instruction - Diagnosing, debugging, and repairing behavior

j. Interpretation - Inferring situation descriptions from sensor data

k. Classification - Categorizing objects by properties


7. Does the Expert System include certainty factors?

a. Yes    c. I don't know

b. No

8. How much of the problem space is the Expert System expected to cover?

a. 100%          f. 60% to 80%

b. > 99%         g. 40% to 60%

c. 95% to 99%    h. Other ___%

d. 90% to 95%    i. I don't know

e. 80% to 90%




9. What is your estimate of the problem space coverage actually provided by the 
Expert System? 


a. Same as expected    f. 80% to 90%

b. 100%                g. 60% to 80%

c. > 99%               h. 40% to 60%

d. 95% to 99%          i. Other ___%

e. 90% to 95%          j. I don't know

Questions 10 through 12 are concerned with the percentage of problems within the
problem space (covered by the Expert System) that are answered correctly.

10. If human experts currently perform (or previously performed) the task, how 
often is the expert(s) expected to give the correct answer? 

a. Task not performed by human f. 80% to 90% 

b. "Correct" defined by expert g. 60% to 80% 

c. > 99% h. 40% to 60% 

d. 95% to 99% i. Other % 

e. 90% to 95% j. I don't know 


11. How often is the Expert System expected to provide the correct answer?

a. 100%          f. 60% to 80%

b. > 99%         g. 40% to 60%

c. 95% to 99%    h. Other ___%

d. 90% to 95%    i. I don't know

e. 80% to 90%

12. What is your estimate of how often the Expert System actually provides the
correct answer?

a. 100%          f. 60% to 80%

b. > 99%         g. 40% to 60%

c. 95% to 99%    h. Other ___%

d. 90% to 95%    i. I don't know

e. 80% to 90%



13. What was the basis for determining how the system was to behave? Please
indicate the primary basis with a and check all other applicable bases (if
any).

a. A pre-existing document

b. A requirements document completed as part of development

c. Some other developed document

d. A prototype of the system

e. Expert consultation

f. Other

14. How difficult was it to develop the original concept of what the system was
supposed to do?

a. Trivial    d. Hard

b. Easy       e. Impossible

c. Medium

15. Was more than one expert consulted during the development of the system?

a. System was developed by expert

b. Single expert

c. Multiple experts with lead

d. Committee of experts

e. Other

16. If more than one expert was available for consulting, how often did the experts
agree on what results the Expert System was supposed to provide?

a. A single expert was involved    c. Agree ___% of the time.

b. Always agree

17. If the system was not developed by the expert, how much interaction was there
between the expert(s) and the development team?

a. System was developed by expert    d. Regular

b. Constant                          e. Occasional

c. Frequent                          f. None




18. Was the developer(s) part of the user organization? 

a. Yes

b. No

c. Some developers were in the user organization

19. Please indicate which development model was used for developing the Expert
System.

a. Requirements gathering preceded Design, Implementation, and Test 
(Traditional waterfall life-cycle). 

b. Requirements gathered before development of a prototype. A second 
requirements activity preceded Design, Implementation, and Test. 

c. Repetition of the Requirements, Design, Rule Generation, and Proto- 
typing phases until production system (final prototype) was developed. 

d. No effort was made to follow a particular model. 

e. Other 


20. What was the primary language/tool for each part of the Expert System? 

a. Knowledge Structures 

b. Inference Engine 

c. Interface Code 


21. What percentage of the total development effort was dedicated to each part of 
the Expert System? 


a. Knowledge Structures ___%

b. Inference Engine ___% (If an Expert System Shell was used, this
value should be 0%.)

c. Interface Code ___%

22. Since Knowledge Bases can be written using several types of Knowledge Struc-
tures, please indicate how many of the following structures were used. If
another type of structure was used, please describe it and how many were
used.

a. Rules     d. Parameters

b. Frames    e. Statements

c. Facts     f. Other (#) ___ of ___


23. Aside from any difficulties in developing the original concept, how difficult was 
it to express the behavior (through the Knowledge Structures) of the expert? 


a. Trivial    d. Hard

b. Easy       e. Impossible

c. Medium



24. When changes were made to the knowledge structures, how often did some 
unexpected result occur? 


a. Never           d. Usually

b. Occasionally    e. Always

c. Frequently



Questions 25 through 28 are concerned with the evaluation activities performed 
during development. 

25. What evaluation activities were performed on the Knowledge Structures? (indi-
cate any that apply)

a. No evaluation was performed    d. Checked by expert(s)

b. Desk checking                  e. Structural testing (e.g. cover all rules)

c. Formal inspections             f. Other


26. What evaluation activities were performed on the Inference Engine? (indicate 
any that apply) 

a. No evaluation was performed d. Structural testing 

b. Desk checking e. Other 

c. Formal inspections 

27. What evaluation activities were performed on the Interface Code? (indicate 
any that apply) 

a. No evaluation was performed    d. Structural testing (branch or path)

b. Desk checking                  e. Other

c. Formal inspections

28. What testing activities were performed on the executing system? (indicate any
that apply)

a. No evaluation was performed    d. Structural testing (e.g. cover all rules)

b. Checked by expert(s)           e. Other

c. Compared with expected results


29. How much effort was expended in developing the system, including evaluation
activities performed by the developers? ___ person/months.


30. How much of the development effort was spent on evaluation? ___%.

31. Did a separate organization evaluate the Expert System before it was delivered 
to the users? 

a. Yes, there was a separate evaluation organization.

b. No, there was not a separate evaluation organization.

32. If there was a separate evaluation team, how much effort was expended by the
team in evaluating the correctness of the Expert System? ___ person/months.

33. What testing activities were performed on the executing system before the 
system was delivered to the users? (indicate any that apply) 

a. No evaluation was performed       d. User acceptance

b. Checked by expert(s)              e. System run in parallel

c. Compared with expected results    f. Other

34. How were changes to the Expert System distributed to the users? 

a. User updated system at developer's direction 

b. Developers made changes to users' system 

c. Untested system distributed to users 

d. Tested system distributed to the users 

e. Configuration management group distributes system 

f. Other 

35. Compared to conventional software testing efforts, how difficult was the evalu-
ation of the Expert System?

a. Trivial    d. Hard

b. Easy       e. Impossible

c. Medium     f. No evaluation was done

36. Many people feel that some development issues are more of a problem with
Expert Systems than with conventional systems. Which (if any) of the fol-
lowing were problems during implementation or test of this Expert System?

a. Understandability and readability of knowledge structures 

b. Determining test coverage for knowledge structures 

c. Modularity/ Design of knowledge structures 

d. Knowledge validation 

e. Analysis of Certainty Factors 

f. Validating the inference engine 

g. Real-time performance analysis 

h. Complexity of the Problem 

i. Certification 

j. Configuration Management 

k. Other 


37. How reliable is the Expert System required to be?

a. Trusted with human life           d. Assists the expert

b. Trusted with mission objectives   e. Assists the user

c. As reliable as the expert         f. Other

38. Does the Expert System seem to be more reliable or less reliable than conven-
tional systems that are in use?

a. Significantly more reliable    f. Less reliable

b. More reliable                  g. Significantly less reliable

c. Slightly more reliable         h. No comparison is available

d. Similar reliability            i. I don't know

e. Slightly less reliable




39. How many people are expected to make use of the Expert System?


40. How frequently are the (expected) users actually using the system? (Numbers
may add up to more than 100% if the actual number of users is greater than
the expected users.)

a. ___% use the system more than expected

b. ___% use the system about as much as expected

c. ___% use the system less than expected

d. ___% do not use the system

Appendix C. Expert Systems Evaluation Questionnaire (User)

By filling out this NASA-funded questionnaire, you can help define the state-of-the-
practice in the formal evaluation of Expert Systems on current NASA and industry
applications. The information that you provide will be merged with the information 
from all other surveyed projects for the purpose of recommending future research 
and development activities. Individual responses are used solely as input to this 
information merging process. Each survey participant will be sent a copy of the 
final survey results. 

Expert System applications are becoming more prevalent in fields where proper 
functioning is essential, such as the aerospace, medical, and financial industries. It is 
widely claimed that Expert Systems are not as rigorously evaluated as traditional 
software because of unique, unresolved evaluation issues. To ensure the continued 
and safe deployment of Expert Systems into critical areas, adequate evaluation tech- 
niques which address these issues must be developed and performed. 

Instructions 

The following questions concern your experiences with an Expert System, either as
a user or as the manager of a department that uses an Expert System. Feel free to
indicate your answers in any way you like. Some of the choices on the multiple 
choice questions have places to fill in additional information; please indicate the 
choice and include the additional information, if possible. If you have any com- 
ments about the questions or your answers, please write them in the left margin. 

Analysis of the responses may indicate that further discussion is required for com- 
plete understanding of the issues encountered during the evaluation process. Dis- 
cussions will be held either as short one-on-one meetings or by telephone. Would 
you be available, at your convenience, to discuss the evaluation process in more 
detail? 

Yes I am available for discussions. 

Name 

Phone 

No I am not available for discussions. 

If you have any questions regarding this questionnaire, please contact Keith Kelley 
at (713) 282-7303. If possible, please return completed questionnaires within one 
week of receipt to: 

Keith Kelley 
MC 6606 

IBM Federal Sector Division 
3700 Bay Area Blvd. 

Houston, Tx. 77058-1199

Definitions

Expert 

The person who provides the knowledge that is to be captured in the 
Expert System. 

Inference engine 

Processes the knowledge structures to infer a set of output facts from a 
set of input facts. Examples of commercial systems are CLIPS and 
ESE. 

Knowledge structures 

Declarative part of the Expert System which represents the knowledge 
(typically called the Knowledge Base). Examples are frames and rules. 

Problem space 

The total number of cases which could potentially be addressed by the 
Expert System. 

Problem space coverage 

The percentage of the problem space that is addressed by the Expert 
System. For example, if the Expert System is supposed to be able to 
diagnose 100 malfunctions, but the total number of malfunctions is 
known to be 200, the problem space coverage is 50%. 


Questions 

1. What is the name of the Expert System you were/are involved with? 


2. Are you a user of the Expert System or the manager of a department which 
uses the Expert System? 

a. User of the Expert System 

b. Manager of a department using the Expert System 

c. Other _ 

3. Is the Expert System operational or is it a prototype? 

a. Operational system b. Prototype system 

4. Briefly describe what the expert system does. 


5. What field does the problem belong to? 


a. Aerospace              g. Medical

b. Financial              h. Personnel

c. Information Systems    i. Research

d. Hardware               j. Service

e. Manufacturing          k. Software

f. Marketing              l. Other

6. Which of the following items best describes the kind of problem the Expert
System addresses? Please indicate primary purpose with a and check all
other applicable purposes (if any).

a. Design - Configuring objects under constraints 

b. Repair - Executing plans to administer prescribed remedies 

c. Control - Governing overall system behavior 

d. Planning - Designing actions 

e. Diagnosis - Inferring system malfunctions from observables 

f. Debugging - Prescribing remedies for malfunctions 

g. Prediction - Inferring likely consequences of given situations 

h. Monitoring - Comparing observations to expected outcomes 

i. Instruction - Diagnosing, debugging, and repairing behavior

j. Interpretation - Inferring situation descriptions from sensor data 

k. Classification - Categorizing objects by properties 


7. How much of the problem space is the Expert System expected to cover? 


a. 100% 

b. > 99% 

c. 95% to 99% 

d. 90% to 95% 

e. 80% to 90% 


f. 60% to 80%

g. 40% to 60% 

h. Other % 

i. I don't know 


8. What is your estimate of the problem space coverage actually provided by the 
Expert System? 


a. Same as expected    f. 80% to 90%

b. 100%                g. 60% to 80%

c. > 99%               h. 40% to 60%

d. 95% to 99%          i. Other ___%

e. 90% to 95%          j. I don't know


Questions 9 through 11 are concerned with the percentage of problems within the
problem space (covered by the Expert System) that are answered correctly.

9. If human experts currently perform (or previously performed) the task, how 
often is the expert(s) expected to give the correct answer? 


a. Task not performed by human    f. 80% to 90%

b. "Correct" defined by expert    g. 60% to 80%

c. > 99%                          h. 40% to 60%

d. 95% to 99%                     i. Other ___%

e. 90% to 95%                     j. I don't know


10. How often is the Expert System expected to provide the correct answer?

a. 100%          f. 60% to 80%

b. > 99%         g. 40% to 60%

c. 95% to 99%    h. Other ___%

d. 90% to 95%    i. I don't know

e. 80% to 90%

11. What is your estimate of how often the Expert System actually provides the
correct answer?

a. 100%          f. 60% to 80%

b. > 99%         g. 40% to 60%

c. 95% to 99%    h. Other ___%

d. 90% to 95%    i. I don't know

e. 80% to 90%



12. Was the expert(s) a member of the user organization?

a. Yes

b. No

c. User organization provided some expertise

13. Was the developer(s) of the Expert System part of the user organization?

a. Yes

b. No

c. Some development provided by user organization

14. Why do you believe the results that the system gives?

a. Expert says it is correct         e. User acceptance

b. Participated in evaluation        f. I don't trust the results

c. Someone I trust did evaluation    g. Other

d. Personal use and checking

15. How reliable is the Expert System required to be?

a. Trusted with human life           d. Assists the expert

b. Trusted with mission objectives   e. Assists the user

c. As reliable as the expert         f. Other



16. Does the Expert System seem to be more reliable or less reliable than conven-
tional systems that are in use?

a. Significantly more reliable    f. Less reliable

b. More reliable                  g. Significantly less reliable

c. Slightly more reliable         h. No comparison is available

d. Similar reliability            i. I don't know

e. Slightly less reliable



17. How many people are expected to make use of the Expert System?

18. How frequently are the (expected) users actually using the system? (Numbers
may add up to more than 100% if the actual number of users is greater than
the expected users.)

a. ___% use the system more than expected

b. ___% use the system about as much as expected

c. ___% use the system less than expected

d. ___% do not use the system
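
The parenthetical note in this question can be illustrated with a small calculation. The sketch below assumes each percentage is taken relative to the expected number of users, which is one reading of the note rather than a rule stated by the questionnaire; the function name, category labels, and figures are all hypothetical:

```python
def usage_percentages(expected_users, counts):
    """Express each usage category as a percentage of the expected user count."""
    if expected_users <= 0:
        raise ValueError("expected_users must be positive")
    return {category: 100.0 * n / expected_users for category, n in counts.items()}


# Hypothetical figures: 20 users were expected, but 25 people were counted.
shares = usage_percentages(
    20, {"more": 5, "as_expected": 10, "less": 8, "not_at_all": 2}
)
total = sum(shares.values())
print(shares)
print(total)  # prints 125.0
```

Because 25 actual users are measured against 20 expected users, the four answers sum to 125% rather than 100%.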

If you were not involved with evaluating the Expert System, please leave the 
remaining questions unanswered. 

19. How much effort was expended by the evaluation team in evaluating the cor-
rectness of the Expert System? ___ person/months.

20. What testing activities were performed on the executing system before the 
system was delivered to the users? (indicate any that apply) 

a. No evaluation was performed       d. User acceptance

b. Checked by expert(s)              e. System run in parallel

c. Compared with expected results    f. Other

21. If more than one expert was available for consulting, how often did the experts 
agree on what results the Expert System is supposed to provide? 

a. No expert was involved c. Always agree 

b. A single expert was involved d. Agree % of the time. 

22. Compared to conventional software testing efforts, how difficult was the evalu- 
ation of the Expert System? 

a. Trivial d. Hard 

b. Easy e. Impossible 

c. Medium 

23. Many people feel that some development issues are more of a problem with 
Expert Systems than with conventional systems. Which (if any) of the fol- 
lowing were problems during testing of the Expert System? 

a. Understandability and readability of knowledge structures 

b. Determining test coverage for knowledge structures 

c. Modularity/ Design of knowledge structures 

d. Knowledge validation 

e. Analysis of Certainty Factors 

f. Validating the inference engines 

g. Real-time performance analysis 

h. Complexity of the Problem 

i. Certification 

j. Other